Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update datatable usage #4123

Merged
merged 1 commit into from
Feb 16, 2019
Merged

Update datatable usage #4123

merged 1 commit into from
Feb 16, 2019

Conversation

st-pasha
Copy link
Contributor

This PR ensures that xgboost continues to work correctly with the upcoming changes to the datatable package.

  • Prefer name dt.Frame instead of dt.DataTable (which was deprecated a year ago);
  • dt.Frame.internal namespace will be removed; instead the underlying data pointers can be accessed using datatable.internal.frame_column_data_r() (which returns a ctypes.c_void_p object).

@CodingCat
Copy link
Member

I am not a python expert, but can we have some tests to guard these code with compatibility? (we are missing many in various packages, python/jvm, not sure about R)

@st-pasha
Copy link
Contributor Author

@CodingCat Testing whether xgboost works with different versions of different libraries is possible, but not trivial. Because in python you can have only one version of a library installed at any given time, these tests would necessarily have to install one version, run a test, then reinstall another version, etc. Now, a test is not supposed to alter user's environment -- a better place to implement this entire process is from a CI such as Jenkins or Travis.

@hcho3
Copy link
Collaborator

hcho3 commented Feb 11, 2019

As @st-pasha points out, we should use CI to try out multiple versions of datatable.

@trivialfis
Copy link
Member

trivialfis commented Feb 16, 2019

I'm going to merge this since all current tests are not changed and passed. @hcho3 I doubt the possibility of testing each version of dependencies. We have sklearn, pandas, datatable and later some others for GPU. In the context of Python, easiest way is always staying with the latest version from pip and advice users to do the same (maybe with help from virtualenv or similar tools).

@trivialfis trivialfis merged commit ff2d4c9 into dmlc:master Feb 16, 2019
@hcho3
Copy link
Collaborator

hcho3 commented Feb 16, 2019

@trivialfis Yeah, you're right. Except for major dependencies (e.g. CUDA), we should just use the latest versions of dependencies for testing and ask the users to do the same.

@lock lock bot locked as resolved and limited conversation to collaborators May 17, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants